
    Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision.

    Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. The primate visual system, by contrast, contains abundant recurrent connections. Recurrent signal flow enables recycling of limited computational resources over time, and so might boost the performance of a physically finite brain or model. Here we show: (1) Recurrent convolutional neural network models outperform feedforward convolutional models matched in their number of parameters in large-scale visual recognition tasks on natural images. (2) Setting a confidence threshold, at which recurrent computations terminate and a decision is made, enables flexible trading of speed for accuracy. At a given confidence threshold, the model expends more time and energy on images that are harder to recognise, without requiring additional parameters for deeper computations. (3) The recurrent model's reaction time for an image predicts the human reaction time for the same image better than several parameter-matched and state-of-the-art feedforward models. (4) Across confidence thresholds, the recurrent model emulates the behaviour of feedforward control models in that it achieves the same accuracy at approximately the same computational cost (mean number of floating-point operations). However, the recurrent model can be run longer (higher confidence threshold) and then outperforms parameter-matched feedforward comparison models. These results suggest that recurrent connectivity, a hallmark of biological visual systems, may be essential for understanding the accuracy, flexibility, and dynamics of human visual recognition.
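The termination rule described in point (2) above, iterating the recurrent computation until the network's confidence exceeds a threshold, can be sketched as a toy evidence-accumulation loop. All names and the noise model below are illustrative assumptions, not the authors' trained network:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(evidence, threshold=0.9, max_steps=50):
    """Accumulate noisy evidence recurrently; stop when softmax confidence
    crosses `threshold`. Stronger evidence (an 'easier image') terminates
    in fewer steps, trading speed against accuracy."""
    rng = np.random.default_rng(0)
    state = np.zeros_like(evidence)
    for step in range(1, max_steps + 1):
        state = state + evidence + rng.normal(scale=0.5, size=state.shape)
        probs = softmax(state)
        if probs.max() >= threshold:          # confident: stop early
            return int(probs.argmax()), step
    return int(probs.argmax()), max_steps     # step budget exhausted

# Easy vs. hard 'images' (evidence vectors over 3 classes)
easy_class, easy_steps = classify(np.array([2.0, 0.0, 0.0]))
hard_class, hard_steps = classify(np.array([0.2, 0.0, 0.0]))
```

Raising `threshold` makes the loop run longer on average, which is the mechanism by which a single parameter set can be traded between speed and accuracy.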

    The spatiotemporal neural dynamics underlying perceived similarity for real-world objects.

    The degree to which we perceive real-world objects as similar or dissimilar structures our perception and guides categorization behavior. Here, we investigated the neural representations enabling perceived similarity using behavioral judgments, fMRI and MEG. As different object dimensions co-occur and partly correlate, to understand the relationship between perceived similarity and brain activity it is necessary to assess the unique role of multiple object dimensions. We thus behaviorally assessed perceived object similarity in relation to shape, function, color and background. We then used representational similarity analyses to relate these behavioral judgments to brain activity. We observed a link between each object dimension and representations in visual cortex. These representations emerged rapidly within 200 ms of stimulus onset. Assessing the unique role of each object dimension revealed partly overlapping and distributed representations: while color-related representations distinctly preceded shape-related representations both in the processing hierarchy of the ventral visual pathway and in time, several dimensions were linked to high-level ventral visual cortex. Further analysis singled out the shape dimension as neither fully accounted for by supra-category membership, nor by a deep neural network trained on object categorization. Together, our results comprehensively characterize the relationship between perceived similarity of key object dimensions and neural activity.
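The representational similarity analysis mentioned above links behavioral judgments to brain activity by comparing dissimilarity structures rather than raw patterns. A minimal sketch on synthetic data; the dimensions and the correlation-distance metric are illustrative choices, not the paper's exact pipeline:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """Representational dissimilarity matrix (correlation distance),
    vectorized as the upper triangle: one value per condition pair."""
    return pdist(patterns, metric="correlation")

def rsa(neural_patterns, behavioral_rdm):
    """Spearman correlation between a neural RDM and a behavioral RDM."""
    rho, _ = spearmanr(rdm(neural_patterns), behavioral_rdm)
    return rho

rng = np.random.default_rng(1)
neural = rng.normal(size=(20, 50))   # 20 objects x 50 voxels/sensors
# An RDM sharing the neural structure should correlate; an unrelated one should not
matched = rdm(neural + 0.1 * rng.normal(size=neural.shape))
unrelated = rdm(rng.normal(size=(20, 50)))
```

Because only rank order of pairwise dissimilarities is compared, the method is agnostic to the units of the two measurement modalities, which is why it can bridge behavior, fMRI and MEG.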

    The hippocampus as the switchboard between perception and memory.

    Adaptive memory recall requires a rapid and flexible switch from external perceptual reminders to internal mnemonic representations. However, owing to the limited temporal or spatial resolution of brain imaging modalities used in isolation, the hippocampal–cortical dynamics supporting this process remain unknown. We thus employed an object-scene cued recall paradigm across two studies, including intracranial electroencephalography (iEEG) and high-density scalp EEG. First, a sustained increase in hippocampal high gamma power (55 to 110 Hz) emerged 500 ms after cue onset and distinguished successful vs. unsuccessful recall. This increase in gamma power for successful recall was followed by a decrease in hippocampal alpha power (8 to 12 Hz). Intriguingly, the hippocampal gamma power increase marked the moment at which extrahippocampal activation patterns shifted from perceptual cue toward mnemonic target representations. In parallel, source-localized EEG alpha power revealed that the recall signal progresses from hippocampus to posterior parietal cortex and then to medial prefrontal cortex. Together, these results identify the hippocampus as the switchboard between perception and memory and elucidate the ensuing hippocampal–cortical dynamics supporting the recall process.
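The high-gamma (55 to 110 Hz) power measure used above can be approximated generically by band-pass filtering and taking the squared Hilbert envelope. The sketch below runs on a synthetic trace rather than iEEG data and is not the study's exact preprocessing:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power(x, fs, low=55.0, high=110.0, order=4):
    """Instantaneous power in [low, high] Hz: band-pass filter,
    then squared Hilbert envelope."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return np.abs(hilbert(filtfilt(b, a, x))) ** 2

fs = 1000                                # sampling rate, Hz
t = np.arange(0, 2.0, 1 / fs)
# Synthetic trace: 10 Hz alpha throughout, an 80 Hz 'gamma burst' after 1 s
trace = np.sin(2 * np.pi * 10 * t)
trace[t >= 1.0] += 0.5 * np.sin(2 * np.pi * 80 * t[t >= 1.0])
gamma = band_power(trace, fs)            # high in the second half only
```

Averaging such envelopes over trials, time-locked to cue onset, is the standard way a sustained post-cue power increase like the one reported here would be quantified.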

    People-selectivity, audiovisual integration and heteromodality in the superior temporal sulcus

    The functional role of the superior temporal sulcus (STS) has been examined in a number of studies, including those investigating face perception, voice perception, and face–voice integration. However, the nature of the STS preference for these ‘social stimuli’ remains unclear, as does the location within the STS for specific types of information processing. The aim of this study was to directly examine properties of the STS in terms of selective response to social stimuli. We used functional magnetic resonance imaging (fMRI) to scan participants whilst they were presented with auditory, visual, or audiovisual stimuli of people or objects, with the intention of localising areas preferring both faces and voices (i.e., ‘people-selective’ regions) and audiovisual regions designed to specifically integrate person-related information. Results highlighted a ‘people-selective, heteromodal’ region in the trunk of the right STS which was activated by both faces and voices, and a restricted portion of the right posterior STS (pSTS) with an integrative preference for information from people, as compared to objects. These results point towards the dedicated role of the STS as a ‘social-information processing’ centre.

    Electrophysiological evidence for an early processing of human voices

    Background: Previous electrophysiological studies have identified a "voice specific response" (VSR) peaking around 320 ms after stimulus onset, a latency markedly longer than the 70 ms needed to discriminate living from non-living sound sources and the 150 ms to 200 ms needed for the processing of voice paralinguistic qualities. In the present study, we investigated whether an early electrophysiological difference between voice and non-voice stimuli could be observed.
    Results: ERPs were recorded from 32 healthy volunteers who listened to 200 ms long stimuli from three sound categories (voices, bird songs and environmental sounds) whilst performing a pure-tone detection task. ERP analyses revealed voice/non-voice amplitude differences emerging as early as 164 ms post stimulus onset and peaking around 200 ms on fronto-temporal (positivity) and occipital (negativity) electrodes.
    Conclusion: Our electrophysiological results suggest a rapid brain discrimination of sounds of voice, termed the "fronto-temporal positivity to voices" (FTPV), at latencies comparable to the well-known face-preferential N170.
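Estimating when two ERP waveforms begin to differ, as with the 164 ms voice/non-voice divergence reported above, can be illustrated with a pointwise test plus a minimum-run criterion on synthetic trials. Real analyses typically use cluster-based statistics; all parameters below are illustrative:

```python
import numpy as np
from scipy.stats import ttest_ind

def divergence_onset(trials_a, trials_b, times, alpha=0.01, min_run=10):
    """Earliest time at which the two conditions differ for `min_run`
    consecutive samples (a crude onset estimate)."""
    _, p = ttest_ind(trials_a, trials_b, axis=0)
    run = 0
    for i, significant in enumerate(p < alpha):
        run = run + 1 if significant else 0
        if run == min_run:
            return times[i - min_run + 1]
    return None

fs = 500                                   # sampling rate, Hz
times = np.arange(-0.1, 0.4, 1 / fs)       # seconds around stimulus onset
rng = np.random.default_rng(2)
n = 60
noise = rng.normal(size=(2 * n, times.size))
effect = np.where(times > 0.16, 1.0, 0.0)  # difference appears at ~160 ms
voice, nonvoice = noise[:n] + effect, noise[n:]
onset = divergence_onset(voice, nonvoice, times)
```

The `min_run` requirement guards against isolated false positives at single timepoints; cluster-based permutation tests are the principled version of the same idea.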

    Retrieval induces adaptive forgetting of competing memories via cortical pattern suppression

    Remembering a past experience can, surprisingly, cause forgetting. Forgetting arises when other competing traces interfere with retrieval and inhibitory control mechanisms are engaged to suppress the distraction they cause. This form of forgetting is considered to be adaptive because it reduces future interference. The effect of this proposed inhibition process on competing memories has, however, never been observed, as behavioral methods are 'blind' to retrieval dynamics and neuroimaging methods have not isolated retrieval of individual memories. We developed a canonical template tracking method to quantify the activation state of individual target memories and competitors during retrieval. This method revealed that repeatedly retrieving target memories suppressed cortical patterns unique to competitors. Pattern suppression was related to engagement of prefrontal regions that have been implicated in resolving retrieval competition and, critically, predicted later forgetting. Thus, our findings demonstrate a cortical pattern suppression mechanism through which remembering adaptively shapes which aspects of our past remain accessible.
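The canonical template tracking idea can be sketched in its simplest form: estimate a per-item template from independent localizer data, then score each retrieval trial's pattern against the target's and the competitor's templates. Synthetic data and hypothetical variable names below, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
n_vox = 100

def make_template(localizer_trials):
    """Canonical template for an item = mean pattern over localizer trials."""
    return localizer_trials.mean(axis=0)

def activation_score(trial_pattern, template):
    """How active an item's memory is on this trial: Pearson correlation
    between the trial's pattern and the item's canonical template."""
    return float(np.corrcoef(trial_pattern, template)[0, 1])

# Item-specific 'true' patterns plus 8 noisy localizer trials for each item
target_true = rng.normal(size=n_vox)
competitor_true = rng.normal(size=n_vox)
target_tpl = make_template(target_true + 0.3 * rng.normal(size=(8, n_vox)))
competitor_tpl = make_template(competitor_true + 0.3 * rng.normal(size=(8, n_vox)))

# A retrieval trial whose pattern is dominated by the target memory
trial = 0.8 * target_true + 0.1 * competitor_true + 0.3 * rng.normal(size=n_vox)
```

Tracking the competitor's score across repeated retrievals is what would reveal the progressive pattern suppression described above.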

    Hierarchical organisation of voice and voice gender perception

    The most important sound in our auditory environment is the human voice. Voice professionals, whether they are teachers, radio hosts or sports coaches, use their voice on an everyday basis to earn their living and to communicate information and knowledge. We grow up spending most of our time listening to voices, in school, at the sports club, on TV, and so on, so much so that by the time we are adults, voice plays a major role in our everyday social interactions. Yet, while extensive research has been conducted on speech perception, voice perception in its own right has only recently begun to attract interest in the cognitive neuroscience community. Voice is not "just" a speech carrier: it conveys rich paralinguistic information such as gender, age, identity or affective state. A theoretical model emphasising the similarities between face and voice processing was recently introduced, proposing serial and parallel processing pathways for voice information that lead to high-level cognitive processes such as person identification. Globally, this model suggests an extraction of low-level acoustic features, followed by voice structural encoding, leading to parallel pathways for the recognition of speech-, affect- and identity-related information; it also suggests potential interactions with face perception pathways. In this thesis, I investigated two different stages of this voice perception model. First, little is known about the speed at which the distinction between vocal and non-vocal sounds is made, i.e. whether there is a time-frame in which the "voice structural analysis" occurs. Using electroencephalography, we conducted an experiment to delineate this voice vs. non-voice perception time-frame. I observed an early electrophysiological response preferential to voice stimuli, emerging around 164 ms on fronto-temporal electrodes FC5 and FC6, which was termed the "fronto-temporal positivity to voices".
Second, little is known about the neural basis of the perception of paralinguistic information such as identity, gender or affective state contained in the human voice. I used voice gender as a tool to investigate the "voice recognition units" stage of the voice perception model. The cognitive processes behind voice gender perception are still under debate; in particular, it remains unclear whether the representation of voice gender is organised around low-level acoustical discriminants or relies on high-level categorical representations. Voice gender continua can be created to parametrically control the degree of gender information contained in a voice. I investigated the importance of low-level acoustic features using recently developed auditory morphing algorithms. I averaged 32 male and 32 female voices to approximate a prototypical voice for each gender. From those prototypes, I generated caricatures by exaggerating the acoustical properties of the male prototype with reference to the female prototype. These voice composites were included, along with three pairs of male and female voice exemplars, in a voice gender adaptation experiment. I observed significantly stronger perceptual after-effects following adaptation to the voice gender caricatures. This result provides evidence for a determining role of low-level acoustical features in our ability to perceive the gender of a voice. Finally, using functional magnetic resonance imaging (fMRI), I investigated whether brain regions of the auditory cortex are sensitive to voice gender and voice gender adaptation, and whether a dissociation between the extraction of acoustical features and higher-level, perceptual representations could be achieved. I used voice gender continua and an event-related fMRI design, the continuous carry-over design, to assess these working hypotheses.
I observed a covariation between BOLD signal and the degree of acoustical difference between consecutive voices in the anterior part of the right superior temporal sulcus, where the extraction of voice gender-related acoustical features may occur. Furthermore, I observed a higher-level network involving the bilateral inferior frontal gyrus, the insula and the anterior cingulate cortex, which would receive a summary of acoustical features from auditory areas and enable voice gender categorisation.
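The morphing and caricaturing step described in the thesis, exaggerating the male prototype's properties relative to the female prototype, amounts to linear extrapolation along the prototype-to-prototype axis in acoustic feature space. A schematic sketch, in which the two-element feature vectors are stand-ins for the morphing algorithm's actual parameters:

```python
import numpy as np

def morph(female_proto, male_proto, alpha):
    """Point on the female->male axis in feature space.
    alpha = 0 gives the female prototype, 1 the male prototype,
    and alpha > 1 a male caricature (features pushed past the prototype)."""
    return female_proto + alpha * (male_proto - female_proto)

# Toy 'acoustic features': [mean f0 (Hz), formant dispersion (Hz)] -- illustrative
female = np.array([220.0, 1050.0])
male = np.array([120.0, 950.0])

caricature = morph(female, male, 1.5)   # 150% male: beyond the male prototype
```

Intermediate alphas between 0 and 1 yield the gender continua used in the adaptation and fMRI experiments, while alphas above 1 yield the caricatures that produced the stronger after-effects.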